LOCAL ANTITHETIC SAMPLING WITH SCRAMBLED NETS



By Art B. Owen 

Stanford University 

We consider the problem of computing an approximation to the
integral $I = \int_{[0,1]^d} f(x)\,dx$. Monte Carlo (MC) sampling typically
attains a root mean squared error (RMSE) of $O(n^{-1/2})$ from $n$
independent random function evaluations. By contrast, quasi-Monte
Carlo (QMC) sampling using carefully equispaced evaluation points
can attain the rate $O(n^{-1+\varepsilon})$ for any $\varepsilon > 0$ and
randomized QMC (RQMC) can attain the RMSE $O(n^{-3/2+\varepsilon})$, both
under mild conditions on $f$.

Classical variance reduction methods for MC can be adapted to 
QMC. Published results combining QMC with importance sampling 
and with control variates have found worthwhile improvements, but 
no change in the error rate. This paper extends the classical vari- 
ance reduction method of antithetic sampling and combines it with 
RQMC. One such method is shown to bring a modest improvement in 
the RMSE rate, attaining $O(n^{-3/2-1/d+\varepsilon})$ for any
$\varepsilon > 0$, for smooth enough $f$.

1. Introduction. Many problems in science and engineering require
multidimensional quadratures. There we seek the value of an integral
$I = \int_{[0,1]^d} f(x)\,dx$. The integrand $f$ subsumes any transformations
necessary to account for noncubic domains, or integration with respect to a
nonuniform density. Monte Carlo sampling is often employed for these
problems. Its basic form uses an estimate
$\hat I = (1/n)\sum_{i=1}^n f(x_i)$, where the $x_i$ are simulated
independent draws from $U[0,1]^d$. When $f$ is in $L^2$, then Monte Carlo has
a root mean squared error (RMSE) at the familiar $O(n^{-1/2})$ rate.

Monte Carlo integration can be improved by the use of variance reduc- 
tion methods. Well-known techniques include stratification, importance sam- 
pling, control variates and antithetic sampling. These are described in texts 
such as Glasserman [10] and Fishman [8]. 
In stratification, the sample points are made more uniformly
distributed than they would be by chance. This idea of choosing points more
uniformly than they would be by chance underlies quasi-Monte Carlo (QMC)
sampling which can be thought of as an extreme version of stratification.
Deterministic QMC methods can attain an error rate of
$O(n^{-1+\varepsilon})$, while randomized versions can achieve an RMSE of
$O(n^{-3/2+\varepsilon})$, both under mild smoothness conditions on $f$, for
any $\varepsilon > 0$.

It is interesting to investigate whether variance reduction techniques from 
MC bring any advantages to the QMC setting. Chelson [3] and Spanier and 
Maize [27] have investigated QMC with importance sampling. Hickernell, 
Lemieux and Owen [12] have studied the combination of QMC with con- 
trol variates. This paper considers a combination of QMC with antithetic 
sampling. 

Antithetic sampling improves Monte Carlo by exploiting spatial structure
in $f$. Each point $x \in [0,1]^d$ is coupled with another $\tilde x$,
commonly obtained as $\tilde x = 1 - x$ interpreted componentwise. In
practice, we average $\tilde f(x_i) = (f(x_i) + f(\tilde x_i))/2$ at $n/2$
points $x_i$. If $f(x)$ is linear in $x$, then $\tilde f(x_i) = I$ and $I$
can be estimated without error. When $f(x)$ is nearly linear or nearly
antisymmetric [i.e., $f(x) - I = I - f(\tilde x)$], then antithetic sampling
can bring a great reduction in RMSE, although the rate remains $n^{-1/2}$. In
local antithetic sampling, described below, the point $\tilde x$ is always
close to $x$.
Since smooth functions are locally linear in the Taylor approximation sense, 
local antithetic sampling can be much better than antithetic sampling for 
small d. 
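
The effect of ordinary (global) antithetic sampling on a linear integrand can be checked with a short sketch. The function names and the test integrand below are illustrative assumptions, not part of the paper; the code only shows that pairing $x$ with $1 - x$ reproduces the integral of a linear $f$ up to rounding, while plain Monte Carlo retains sampling error.

```python
import random

def mc_estimate(f, n, d, rng):
    """Plain Monte Carlo: average f over n independent U[0,1]^d points."""
    total = 0.0
    for _ in range(n):
        x = [rng.random() for _ in range(d)]
        total += f(x)
    return total / n

def antithetic_estimate(f, n, d, rng):
    """Antithetic sampling: average (f(x) + f(1 - x)) / 2 over n/2 points."""
    half = n // 2
    total = 0.0
    for _ in range(half):
        x = [rng.random() for _ in range(d)]
        x_tilde = [1.0 - xj for xj in x]      # componentwise reflection
        total += 0.5 * (f(x) + f(x_tilde))
    return total / half

def linear(x):
    # Hypothetical linear integrand; its integral over [0,1]^2 is 1 + 1 - 1.5 = 0.5.
    return 1.0 + 2.0 * x[0] - 3.0 * x[1]

rng = random.Random(0)
est_mc = mc_estimate(linear, 1000, 2, rng)
est_anti = antithetic_estimate(linear, 1000, 2, rng)
```

For a nonlinear integrand the antithetic estimate is no longer exact, and only the constant in the $n^{-1/2}$ rate changes.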

This paper considers several ways of combining antithetic sampling and 
randomized digital nets. The main result is that one such method, a box 
folding scheme, reduces the RMSE to $O(n^{-3/2-1/d+\varepsilon})$. The improvement in
rate is modest and diminishes with d. But it compares favorably with or- 
dinary antithetic sampling which only changes the constant in the RMSE, 
and changes it for the worse for some /. The other variance reduction meth- 
ods from MC (control variates and importance sampling) only act on the 
constant and do not improve the RMSE rate when applied to randomized 
QMC. 

The improvement we find is the same factor $n^{-1/d}$ from classic results of
Haber [11]. Haber gets an RMSE rate of $O(n^{-1/2-1/d})$ for cubically
stratified sampling and it improves to $O(n^{-1/2-2/d})$ for a locally
antithetic version of that sampling.

The outline of this paper is as follows. Section 2 summarizes background 
information on scrambled nets, which are a form of randomized quasi-Monte 
Carlo sampling. Section 3 introduces some new notions of d-dimensional 
folding operations used to introduce local antithetic properties into digital 
nets, and proposes three specific methods. Section 4 illustrates several
reflection net sampling schemes on a two-dimensional integrand studied by
Sloan and Joe [25]. The root mean squared errors seem to follow an $n^{-2}$
rate. The next sections are devoted to showing that one of the methods, box
folding, attains an RMSE of $O(n^{-3/2-1/d+\varepsilon})$. Section 5 recaps
the variance for scrambled net quadrature of smooth functions. It corrects an
error in the proof of the $O(n^{-3/2}(\log n)^{(d-1)/2})$ RMSE rate from [21].
It also extends the proof there to a wider collection of digital nets and
uses a weaker smoothness condition than the earlier paper had. Section 6
builds on Section 5 to prove that the RMSE of the box folding scheme is
$O(n^{-3/2-1/d}(\log n)^{(d-1)/2})$ in $d$ dimensions. More smoothness is
required for this result than for the unreflected
scrambled nets. Section 7 presents the box folding scheme as a hybrid of a 
monomial cubature rule with scrambled net sampling. Finally, it discusses 
how one might make use of these findings in higher dimensional problems 
of low effective dimension. 

2. Background and notation. Scrambled nets are a particular form of
randomized quasi-Monte Carlo sampling. The monograph [17] by Niederreiter
is the definitive source for quasi-Monte Carlo sampling. Randomized
quasi-Monte Carlo sampling was surveyed by Lemieux and L'Ecuyer [15].
Scrambled nets were first proposed in [19].

We use superscripts for components, so $x, x_i \in [0,1]^d$ have components
$x^j$ and $x_i^j$ respectively for $j = 1, \dots, d$. The set
$\{1, \dots, d\}$ is abbreviated $1{:}d$. If $u \subseteq 1{:}d$, then its
complement $\{1 \le j \le d \mid j \notin u\}$ is written as $-u$.

We often have to extract and combine components from one or more
points in $[0,1]^d$. When we extract the components $x^j$ for
$j \in u \subseteq 1{:}d$, we use $x^u$ to denote the result. When
$x, z \in [0,1]^d$ and we want to combine $x^u$ with $z^{-u}$, we write it as
$x^u{:}z^{-u}$. Thus, $x^u{:}z^{-u}$ is the point $y \in [0,1]^d$ with
$y^j = x^j$ for $j \in u$ and $y^j = z^j$ for $j \notin u$.
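
The component-combining notation can be mirrored in a few lines of code. This is only an illustrative sketch; the helper name `combine` and the sample vectors are assumptions introduced here, with the index set $u$ given as 1-based component indices to match the text.

```python
def combine(x, z, u):
    """Return the point x^u : z^{-u}; u holds 1-based indices taken from x."""
    return [x[j - 1] if j in u else z[j - 1] for j in range(1, len(x) + 1)]

x = [0.1, 0.2, 0.3]
z = [0.7, 0.8, 0.9]
y = combine(x, z, {1, 3})   # components 1 and 3 from x, component 2 from z
```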

2.1. Quasi-Monte Carlo. Like plain Monte Carlo, quasi-Monte Carlo
sampling estimates an integral $I = \int_{[0,1]^d} f(x)\,dx$ by the average
$\hat I = \frac{1}{n}\sum_{i=1}^n f(x_i)$ taken over points
$x_i \in [0,1]^d$. QMC aims to be better than random by selecting $x_i$ to be
even more uniformly distributed than random points typically are. To
quantify the nonuniformity of $x_1, \dots, x_n$, consider the local
discrepancy function

(1) $\delta(x) = \frac{1}{n}\sum_{i=1}^n 1_{x_i \in [0,x)} - \mathrm{Vol}([0,x))$

for $x \in [0,1]^d$. The star discrepancy of $x_1, \dots, x_n$ is

(2) $D_n^*(x_1, \dots, x_n) = \sup_{x \in [0,1]^d} |\delta(x)|$.
When $d = 1$, then $D_n^*$ reduces to the Kolmogorov-Smirnov distance between
the empirical distribution of the $x_i$ and the $U[0,1]$ distribution. The
Koksma-Hlawka inequality [13] is

(3) $|\hat I - I| \le D_n^*(x_1, \dots, x_n)\,\|f\|_{HK}$,

where $\|f\|_{HK}$ is the total variation of $f$ in the sense of Hardy and
Krause. It is possible to construct $x_i$ so that
$D_n^* = O((\log n)^{d-1}/n)$ for $n > 1$. With such constructions,
$|\hat I - I| = O(n^{-1+\varepsilon})$ holds for all $\varepsilon > 0$, under
the mild condition that $\|f\|_{HK} < \infty$. Thus, QMC has a far better
asymptote than MC.
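
For $d = 1$ the star discrepancy can be computed exactly from the sorted sample, using the standard formula for the Kolmogorov-Smirnov distance to the uniform distribution. The sketch below is illustrative only; the function name and the two example point sets are assumptions made here.

```python
def star_discrepancy_1d(points):
    """Star discrepancy of points in [0,1] when d = 1."""
    xs = sorted(points)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs, start=1):
        # The local discrepancy is extremal just before or just after a sample point.
        worst = max(worst, i / n - x, x - (i - 1) / n)
    return worst

n = 8
centered = [(i + 0.5) / n for i in range(n)]   # midpoints of n equal cells
left_ends = [i / n for i in range(n)]          # left endpoints of the same cells
d_centered = star_discrepancy_1d(centered)
d_left = star_discrepancy_1d(left_ends)
```

Centered points attain $1/(2n)$, the smallest possible value for $n$ points in one dimension, while the left endpoints give $1/n$.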



2.2. Digital nets. Digital nets attain their low discrepancy by being
simultaneously stratified for many different stratifications of $[0,1]^d$.
Those stratifications are defined through hyper-rectangular subsets known as
elementary intervals.

This section defines these elementary intervals and some digital nets and 
digital sequences. Throughout we use b to denote an integer base in which to 
represent real numbers, d to represent the dimension, kj to represent some 
nonnegative integer powers of b and tj to represent some nonnegative integer 
translations. 



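Definition 1. Let $b \ge 2$ and $d \ge 1$ be integers. Let
$\kappa = (k_1, \dots, k_d)$ and $\tau = (t_1, \dots, t_d)$ be $d$-vectors of
integers for which $k_j \ge 0$ and $0 \le t_j < b^{k_j}$. Then the set

$B_{\kappa,\tau} = \prod_{j=1}^d \left[\frac{t_j}{b^{k_j}}, \frac{t_j+1}{b^{k_j}}\right)$

is a base $b$ elementary interval.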



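If one fixes $\kappa$ and varies $\tau$, the sets $B_{\kappa,\tau}$ provide a
tiling of $[0,1)^d$. The tilings of the three illustrations in Figure 1 are
of this type.

The volume of $B_{\kappa,\tau}$ is $b^{-|\kappa|}$, where
$|\kappa| = k_1 + \cdots + k_d$. The closure of $B_{\kappa,\tau}$, defined by
replacing the half open intervals in Definition 1 by closed intervals, is
denoted $\bar B_{\kappa,\tau}$. The center of $B_{\kappa,\tau}$ and of
$\bar B_{\kappa,\tau}$ is the point $c_{\kappa,\tau}$ with
$c_{\kappa,\tau}^j = (t_j + 1/2)/b^{k_j}$.

When one or more of the $k_j$ is 0, then the corresponding factors of
$B_{\kappa,\tau}$ reduce to $[0,1)$. Let $u \subseteq 1{:}d$ and let $\kappa$
be a vector of length $|u|$ indexed by $j \in u$, with component $k_j$ for
$j \in u$. Similarly, let $\tau$ have components $t_j$ for $j \in u$. Then

$B_{u,\kappa,\tau} = \prod_{j \in u} \left[\frac{t_j}{b^{k_j}}, \frac{t_j+1}{b^{k_j}}\right) \prod_{j \notin u} [0,1)$

will be used below. The center of $B_{u,\kappa,\tau}$ is the point
$c_{u,\kappa,\tau}$ with

$c_{u,\kappa,\tau}^j = \begin{cases} (t_j + 1/2)/b^{k_j}, & j \in u, \\ 1/2, & j \notin u. \end{cases}$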



The elementary interval $B_{\kappa,\tau}$ in Definition 1 has volume
$b^{-|\kappa|}$. Ideally it should get $nb^{-|\kappa|}$ of the sample points
$x_1, \dots, x_n$. If that happens for one vector $\kappa$, we have a
stratified sample with one stratum for each $\tau$. Digital nets attain such
stratification for multiple $\kappa$ simultaneously.



Definition 2. For integers $m \ge q \ge 0$, $b \ge 2$ and $d \ge 1$, a
sequence of points $x_1, \dots, x_{b^m} \in [0,1)^d$ is a $(q,m,d)$-net in
base $b$ if every base $b$ elementary interval in $[0,1)^d$ of volume
$b^{q-m}$ contains precisely $b^q$ points of the sequence.
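
The net property in Definition 2 can be verified by brute force for small examples. The sketch below is an illustration under stated assumptions: it uses the two-dimensional Hammersley construction (first coordinate $i/b^m$, second coordinate the van der Corput radical inverse of $i$), which is a standard example of a $(0,m,2)$-net and is not the Faure construction used later in the paper. All function names are hypothetical. Points are stored as integer numerators over $b^m$ so that every interval index is computed exactly.

```python
def radical_inverse_num(i, b, m):
    """Reverse the m base-b digits of i; dividing by b**m gives the van der Corput point."""
    r = 0
    for _ in range(m):
        i, digit = divmod(i, b)
        r = r * b + digit
    return r

def compositions(total, d):
    """All d-vectors of nonnegative integers that sum to total."""
    if d == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, d - 1):
            yield (first,) + rest

def is_net(nums, b, q, m):
    """Check the (q,m,d)-net property for the points nums[i][j] / b**m."""
    d = len(nums[0])
    for kappa in compositions(m - q, d):
        counts = {}
        for p in nums:
            # tau_j = floor(b**k_j * x^j), in exact integer arithmetic
            tau = tuple(p[j] * b ** kappa[j] // b ** m for j in range(d))
            counts[tau] = counts.get(tau, 0) + 1
        if len(counts) != b ** (m - q):
            return False                      # some elementary interval is empty
        if any(c != b ** q for c in counts.values()):
            return False
    return True

b, m = 2, 4
hammersley = [(i, radical_inverse_num(i, b, m)) for i in range(b ** m)]
diagonal = [(i, i) for i in range(b ** m)]    # equispaced points on the diagonal
hammersley_ok = is_net(hammersley, b, 0, m)
diagonal_ok = is_net(diagonal, b, 0, m)
```

The diagonal points are stratified in each coordinate separately but fail for mixed $\kappa$ such as $(2,2)$, which is why they are not a net.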



The parameter $q$ defines the quality of the net, with smaller values
implying better equidistribution, and $q = 0$ being the very best when it is
attainable. The minT system [24] identifies the best known nets (smallest 
q) given the values of m, d and b. The net property is enough to ensure low 
discrepancy: 



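Theorem 1. If $x_1, \dots, x_n$ are a $(q,m,d)$-net in base $b$, then

$D_n^* \le \frac{1}{(d-1)!}\left(\frac{\lfloor b/2 \rfloor}{\log b}\right)^{d-1} \frac{b^q (\log n)^{d-1}}{n} + O\left(\frac{b^q (\log n)^{d-2}}{n}\right)$

for $n > 1$, where the implied constant in the error term depends only on $b$
and $d$.



Proof. This is from Theorem 4.10 of [17]. The multiple of
$(\log n)^{d-1}$ can be reduced somewhat when $d = 2$ and $b$ is even, or
when $d = 3, 4$ and $b = 2$. □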



Some constructions of digital nets are extensible. They let us increase n, 
keeping the stratification property and retaining the earlier function evalu- 
ations. 



Definition 3. For integers $q \ge 0$, $b \ge 2$, and $d \ge 1$, an infinite
sequence of points $x_i \in [0,1)^d$ for $i \ge 1$ is a $(q,d)$-sequence in
base $b$ if every subsequence $x_{rb^m+1}, \dots, x_{(r+1)b^m}$, for
integers $m \ge q$ and $r \ge 0$, is a $(q,m,d)$-net in base $b$.
It is convenient to work with the first $n = \lambda b^m$ points of the
sequence. Should they prove inadequate, one can increase $\lambda$ or, more
generally, use $n' = \lambda' b^{m'} > n$. The points of the new larger rule
include all those of the previous rule. Thus, $(q,d)$-sequences provide
extensible integration rules. They automatically satisfy the
$(\lambda,q,m,d)$-net property:

Definition 4. For integers $m \ge q \ge 0$, $b \ge 2$, $1 \le \lambda < b$
and $d \ge 1$, a sequence of points
$x_1, \dots, x_{\lambda b^m} \in [0,1)^d$ is a $(\lambda,q,m,d)$-net in base
$b$ if every base $b$ elementary interval in $[0,1)^d$ of volume $b^{q-m}$
contains precisely $\lambda b^q$ points of the sequence and no $b$-ary box in
$[0,1)^d$ of volume $b^{q-m-1}$ contains more than $b^q$ points of the
sequence.

A relaxed $(\lambda,q,m,d)$-net in base $b$ is as above, except that
$\lambda \ge b$ is allowed and boxes of volume $b^{q-m-1}$ may have more than
$b^q$ points of the sequence.

2.3. Random digital scrambles. In scrambled digital net quadrature we
take a digital net $a_1, \dots, a_n \in [0,1]^d$ and apply a randomizing
transformation to this ensemble to produce points
$x_1, \dots, x_n \in [0,1]^d$ with two properties: each $x_i$ is individually
$U[0,1]^d$ distributed, and collectively $x_1, \dots, x_n$ are a digital net
with probability 1. The first property makes the sample average
$\hat I = \frac{1}{n}\sum_{i=1}^n f(x_i)$ an unbiased estimate of $I$. The
second property means that $\hat I$ inherits the good accuracy properties of
digital nets.

Some such randomized nets were presented in [19] where it was also shown 
that scrambled digital sequences remain digital sequences with probability 
one. The original motivation for randomizing nets was that it allowed inde- 
pendent replications for the purposes of estimating error. That randomiza- 
tion can improve the error rate was at first a surprise, but is now understood 
as an error cancellation phenomenon. 

Randomizations of nets typically use the same random procedure on each
point $a_i$ in order to yield the corresponding $x_i$, and so we need only
describe the randomization of a single point $a \in [0,1]^d$. Furthermore,
the randomizations applied to components $a^1$ through $a^d$ are typically
chosen to be statistically independent. And so we only need to describe the
randomization of a single point $a \in [0,1]$.

It is beyond the scope of this article to explain how randomization of nets 
is able to achieve the two defining properties. For that one can consult the 
proposal of Owen [19], its derandomization by Matousek [16], and the survey
of Lemieux and L'Ecuyer [15]. We can, however, look at the mechanics of 
some randomizations. 

To scramble the point $a \in [0,1)$, we first write it out in base $b$ as
$a = \sum_{k=1}^\infty a_{(k)} b^{-k}$, where
$a_{(k)} \in \{0, 1, \dots, b-1\}$. Some values of $a$ have two
representations, one ending in infinitely many zeros and the other ending in
$b-1$'s. In such cases we use the representation ending in zeros. For this
reason we do not scramble the value $a = 1$, and so scrambled nets actually
produce points $x_i \in [0,1)^d$ from points $a_i \in [0,1)^d$. This presents
no problem. The standard net constructions yield points in $[0,1)^d$ and
$\int_{[0,1)^d} f(x)\,dx = \int_{[0,1]^d} f(x)\,dx$.

The scrambled version of $a$ is the point
$x = \sum_{k=1}^\infty x_{(k)} b^{-k}$ for digits
$x_{(k)} \in \{0, 1, \dots, b-1\}$ obtained by random permutation schemes
applied to the $a_{(k)}$. In practice, the expansion of $x$ is truncated.

There are $b!$ distinct permutations of $\{0, 1, \dots, b-1\}$. In a uniform
random permutation of this set, each permutation has probability $1/b!$. The
method in [19] uses a great many uniform random permutations to scramble $a$.
One permutation is applied to the first digit yielding
$x_{(1)} = \pi_1(a_{(1)})$. For the $k$th digit $a_{(k)}$, one of $b^{k-1}$
independent uniform random permutations is used to make $x_{(k)}$, chosen
based on the value of $\lfloor b^{k-1} a \rfloor$.
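
A minimal sketch of this nested scheme for one coordinate follows, assuming the expansion is truncated to $K$ digits and that permutations are generated lazily, one per digit prefix. The function names and the dictionary-based storage are illustrative choices made here, not the implementation of [19]. The prefix of earlier digits plays the role of $\lfloor b^{k-1} a \rfloor$.

```python
import random

def to_digits(num, b, K):
    """First K base-b digits of a = num / b**K, most significant first."""
    return [(num // b ** (K - 1 - k)) % b for k in range(K)]

def nested_uniform_scramble(digit_lists, b, rng):
    """Scramble points given as digit lists, sharing permutations across points."""
    perms = {}                          # digit prefix -> uniform random permutation
    out = []
    for a in digit_lists:
        x = []
        for k in range(len(a)):
            prefix = tuple(a[:k])       # the digits of a before this position
            if prefix not in perms:
                p = list(range(b))
                rng.shuffle(p)
                perms[prefix] = p
            x.append(perms[prefix][a[k]])
        out.append(x)
    return out

b, m, K = 3, 2, 5
rng = random.Random(1)
# The b**m points i / b**m, written with K digits (trailing zeros).
net = [to_digits(i * b ** (K - m), b, K) for i in range(b ** m)]
scrambled = nested_uniform_scramble(net, b, rng)
leading = {tuple(x[:m]) for x in scrambled}   # interval of width b**-m holding each point
```

Because each prefix gets a bijection on the next digit, the scrambled points still occupy every interval of width $b^{-m}$ exactly once, which is the one-dimensional version of the net property being preserved.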

The original randomization is computationally burdensome, requiring con- 
siderable storage. Matousek [16] found an alternative and less costly scram- 
bling, by derandomization. We describe that and several other scramblings 
here. Some more scramblings are described in [23] from which the permuta- 
tion and scrambling nomenclature used here is taken. 

Definition 5. If $b$ is a prime number, then a linear random permutation
of $\{0, 1, \dots, b-1\}$ has the form $\pi(a) = h \times a + g \bmod b$,
where $h \in \{1, \dots, b-1\}$ and $g \in \{0, 1, \dots, b-1\}$ are
independent random variables uniformly distributed over their respective
ranges.

Linear permutations are restricted to prime $b$ because otherwise there are
nonzero $h$ for which $h \times a + g$ is not a permutation. For example,
consider $b = 4$ and $h = 2$. Linear permutations have a generalization, via
Galois field arithmetic, to bases that are prime powers, but we do not use
them here.
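
The restriction to prime bases is easy to see numerically. The following sketch (function names are assumptions made for illustration) evaluates $a \mapsto h \times a + g \bmod b$ on all digits and shows the failure for $b = 4$, $h = 2$ mentioned above.

```python
import random

def linear_permutation(b, h, g):
    """Images of 0, ..., b-1 under a -> h*a + g mod b."""
    return [(h * a + g) % b for a in range(b)]

def random_linear_permutation(b, rng):
    """Draw h from {1,...,b-1} and g from {0,...,b-1}, uniformly and independently."""
    h = rng.randrange(1, b)
    g = rng.randrange(0, b)
    return linear_permutation(b, h, g)

def is_permutation(values, b):
    return sorted(values) == list(range(b))

bad = linear_permutation(4, 2, 0)   # b = 4 is not prime and h = 2 shares a factor with b
good = [linear_permutation(5, h, g) for h in range(1, 5) for g in range(5)]
sample = random_linear_permutation(7, random.Random(3))
```

With $b = 4$ and $h = 2$ the digits 0 and 2 both map to $g$, so the map is two-to-one.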

Definition 6. For a prime base $b$, an affine matrix scramble takes the
form

$x_{(k)} = C_k + \sum_{j=1}^k M_{kj} a_{(j)} \bmod b$,

where $C_k$ and $M_{kj}$ are in $\{0, 1, \dots, b-1\}$.

We will consider affine matrix scrambles in which the $C_k$ are independent
uniformly distributed elements of $\{0, 1, \dots, b-1\}$, independent of the
elements $M_{kj}$. Such scrambles always have $x \sim U[0,1]$ regardless of
$a$ and $M_{kj}$.

The matrix scrambles we consider differ in the structure of the matrix M. 
In each case M is lower triangular and invertible. Invertibility is required 
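so that distinct points $a$ lead to distinct points $x$. The structures that
we consider for $M$ can be represented as

(4)
$$
\begin{pmatrix}
h_1 & & & & \\
g_{21} & h_2 & & & \\
g_{31} & g_{32} & h_3 & & \\
g_{41} & g_{42} & g_{43} & h_4 & \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix},
\qquad
\begin{pmatrix}
h_1 & & & & \\
g_2 & h_1 & & & \\
g_3 & g_2 & h_1 & & \\
g_4 & g_3 & g_2 & h_1 & \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix}
\qquad\text{and}\qquad
\begin{pmatrix}
h_1 & & & & \\
h_1 & h_2 & & & \\
h_1 & h_2 & h_3 & & \\
h_1 & h_2 & h_3 & h_4 & \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{pmatrix},
$$

where $h$'s are sampled from $\{1, 2, \dots, b-1\}$ and $g$'s are sampled
from $\{0, 1, \dots, b-1\}$. Within each matrix, entries with the same symbol
are identical and entries with different symbols are sampled independently.
The matrices in (4) describe respectively, random linear scrambling of [16],
I-binomial scrambling of [30] and affine striped matrix (ASM) sampling from
[23].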
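
The three structures in (4) and the affine scramble of Definition 6 can be sketched as follows, truncating to $K$ digits. The function names and the finite truncation are assumptions made for illustration; this is not code from [16], [30] or [23]. The final loop counts distinct images of all $b^K$ digit vectors, which should equal $b^K$ because each matrix is lower triangular with a nonzero diagonal modulo the prime $b$.

```python
import random
from itertools import product

def random_linear_matrix(K, b, rng):
    """Independent nonzero h's on the diagonal, independent g's below it."""
    return [[rng.randrange(1, b) if j == k else rng.randrange(b) if j < k else 0
             for j in range(K)] for k in range(K)]

def i_binomial_matrix(K, b, rng):
    """Constant along each diagonal: h_1 on the main diagonal, then g_2, g_3, ..."""
    diag = [rng.randrange(1, b)] + [rng.randrange(b) for _ in range(K - 1)]
    return [[diag[k - j] if j <= k else 0 for j in range(K)] for k in range(K)]

def striped_matrix(K, b, rng):
    """Constant down each column: column j holds h_j on and below the diagonal."""
    h = [rng.randrange(1, b) for _ in range(K)]
    return [[h[j] if j <= k else 0 for j in range(K)] for k in range(K)]

def affine_scramble(a, M, C, b):
    """Digits x_(k) = C_k + sum_{j<=k} M_kj a_(j) mod b for one digit vector a."""
    return [(C[k] + sum(M[k][j] * a[j] for j in range(k + 1))) % b
            for k in range(len(a))]

b, K = 5, 3
rng = random.Random(2)
C = [rng.randrange(b) for _ in range(K)]
distinct = {}
for make in (random_linear_matrix, i_binomial_matrix, striped_matrix):
    M = make(K, b, rng)
    images = {tuple(affine_scramble(a, M, C, b)) for a in product(range(b), repeat=K)}
    distinct[make.__name__] = len(images)
```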

Random linear scrambling leads to the same sampling variance as the
original net scrambling in [19] (called "nested uniform scrambling") but
requires much less storage. I-binomial scrambling also leads to the same
sampling variance but does so with still less storage.

The ASM scrambling is not variance equivalent to nested uniform scrambling.
In the case $d = 1$, ASM attains an RMSE of $O(n^{-2})$, when $f''(x)$
is bounded, which is better than the rate $O(n^{-3/2})$ from other scrambles,
though not as good as the rate $O(n^{-5/2})$ that Haber's method gets for
$d = 1$.

Our strategy for improving randomized nets is to build in directly some 
d-dimensional versions of locally antithetic sampling. The local antithetic 
sampling strategy is implemented by adjoining to the scrambled net certain 
reflections of sample points. 



2.4. ANOVA. For a function $f \in L^2[0,1]^d$, the ANOVA decomposition
is available to quantify the extent to which $f$ depends primarily on lower
dimensional projections of the input space. Informally it is like embedding
a regular $K^d$ grid in $[0,1]^d$, running an ANOVA on that grid and letting
$K \to \infty$. The ANOVA of $[0,1]^d$ was introduced by Hoeffding [14],
figures in the Efron-Stein inequality [6], and was independently discovered
by Sobol' [26]. For more details and the early history of the ANOVA
decomposition, see [29].

We write $f(x) = \sum_{u \subseteq 1:d} f_u(x)$, where $f_u(x)$ is a function
of $x$ that depends on $x$ only through $x^u$. To get $f_u$, we subtract
strict sub-effects $f_v$ for $v \subsetneq u$ and then average the residual
over $x^{-u}$. Specifically,

(5) $f_u(x) = \int_{[0,1]^{d-|u|}} \Bigl( f(x) - \sum_{v \subsetneq u} f_v(x) \Bigr)\,dx^{-u}$.

The ANOVA terms are orthogonal in that $\int f_u(x) f_v(x)\,dx = 0$ for
subsets $u \ne v$. Letting $\sigma_u^2 = \int f_u(x)^2\,dx$ for $|u| > 0$ and
writing $\sigma^2$ for the variance of $f$, we find that
$\sigma^2 = \sum_{|u|>0} \sigma_u^2$.

2.5. Smoothness and mixed partial derivatives. This section introduces 
our notion of smoothness for / and records some elementary consequences 
of the definition for later use. The mixed partial derivative of $f$ taken
once with respect to $x^j$ for each $j \in u$ is denoted by $\partial^u f$
with the convention that $\partial^\emptyset f(x) = f(x)$.

Definition 7. The real valued function $f(x)$ on $[0,1]^d$ is smooth if
$\partial^u f(x)$ is continuous on $[0,1]^d$ for all $u \subseteq 1{:}d$.

Remark 1. There are $|u|!$ orders in which the mixed partial derivative
$\partial^u f(x)$ can be interpreted. The continuity conditions in
Definition 7 are strong enough to ensure that all orderings give the same
function.

Lemma 1. If $f$ is smooth, then $\partial^u f_u(x)$ is continuous for all
$u \subseteq 1{:}d$.

Proof. The details are omitted to save space. The key is to prove by
induction on $|u|$ that
$\partial^u \int f(x)\,dx^{-u} = \int \partial^u f(x)\,dx^{-u}$. □

We also need a version of the fundamental theorem of calculus. For points
$a, b \in [0,1]^d$, define their rectangular hull as the Cartesian product

$\mathrm{rect}[a,b] = \prod_{j=1}^d [\min(a^j, b^j), \max(a^j, b^j)]$.

For $d = 1$, if $f$ has a continuous derivative $f'$ on the interval
$\mathrm{rect}[c,x]$, then $f(x) = f(c) + \int_{[c,x]} f'(y)\,dy$, with the
interpretation that $\int_{[c,x]}$ means $-\int_{\mathrm{rect}[c,x]}$
when $c > x$. For general $d$ and smooth $f$, we have

(6) $f(x) = \sum_{u \subseteq 1:d} \int_{[c^u,x^u]} \partial^u f(c^{-u}{:}y^u)\,dy^u$.

Here $\int_{[c^u,x^u]}$ denotes $\pm\int_{\mathrm{rect}[c^u,x^u]}$ where the
sign is negative if and only if $c^j > x^j$ holds for an odd number of
indices $j \in u$. The term for $u = \emptyset$ equals $f(c)$ under a natural
convention.
More generally, let $w \subseteq \{1, \dots, d\}$ and suppose that
$\partial^u f$ is continuous for $u \subseteq w$. Then

(7) $f(x) = \sum_{u \subseteq w} \int_{[c^u,x^u]} \partial^u f(x^{-w}{:}c^{w-u}{:}y^u)\,dy^u$.

For $v \subseteq u \subseteq \{1, \dots, d\}$, let $\partial^{v,u} f$ denote
the partial derivative of $\partial^u f$ taken once with respect to each
$x^j$ for $j \in v$. That is, $\partial^{v,u} f$ is $f$ differentiated with
respect to $x^j$ twice for $j$ in $v$ and once for $j$ in $u - v$.

Definition 8. The real valued function $f(x)$ on $[0,1]^d$ is doubly smooth
if $\partial^{v,u} f(x)$ is continuous on $[0,1]^d$ for all
$v \subseteq u \subseteq 1{:}d$.

3. $b$-ary reflections and folds. Antithetic sampling is implemented via
reflections about the center point of $[0,1]^d$. To induce various local
antithetic properties, we will use reflections of a point $x$ about the
center of an elementary interval containing $x$.

The case $d = 1$ is simplest. The point $x \in [0,1)$ belongs to the interval
$[tb^{-k}, (t+1)b^{-k})$, where $t = t(x) = \lfloor b^k x \rfloor$. The
center of this interval is $c = c_k(x) = (t + 1/2)b^{-k}$. The $k$th order
reflection of $x$ is $\mathcal{R}_k(x) = 2c_k(x) - x$. The value $k = 0$
corresponds to the simple reflection $1 - x$.

If the base $b$ expansion of $x \in [0,1)$ is
$x = \sum_{\ell=1}^\infty x_{(\ell)} b^{-\ell}$ with each
$x_{(\ell)} \in \{0, 1, \dots, b-1\}$, using trailing 0's when $x$ has two
base $b$ representations, then

(8) $\mathcal{R}_k(x) = \sum_{\ell=1}^k x_{(\ell)} b^{-\ell} + \sum_{\ell=k+1}^\infty (b - 1 - x_{(\ell)}) b^{-\ell}$.

The reflection $\mathcal{R}_k$ leaves the first $k$ digits of $x$ unchanged
and it flips the trailing digits.

By convention, we take
$\mathcal{R}_k(1) = \lim_{x \to 1^-} \mathcal{R}_k(x) = 1 - 1/b^k$. Under
this convention we find that $\lim_{k \to \infty} \mathcal{R}_k(x) = x$ holds
uniformly in $x$. The reflection is nearly an involution because
$\mathcal{R}_k(\mathcal{R}_k(x)) = x$ unless $x = tb^{-k}$ for an integer
$t$ with $0 \le t < b^k - 1$. Note that a reflection of a reflection is not
generally a reflection. For instance, when $x$ is not of the form $tb^{-7}$,
then $\mathcal{R}_3(\mathcal{R}_7(x))$ flips digits 4 through 7 inclusive of
$x$ and leaves all other digits unchanged.
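
The one-dimensional reflection is simple to compute from its geometric definition $\mathcal{R}_k(x) = 2c_k(x) - x$. The sketch below uses exact rational arithmetic so that the digit-flipping behaviour can be checked without rounding; the function name and the test point $x = 1/3$ are illustrative assumptions.

```python
from fractions import Fraction
from math import floor

def reflect_1d(x, k, b):
    """k-th order base-b reflection R_k(x) = 2 c_k(x) - x for x in [0, 1), k >= 0."""
    t = floor(x * b ** k)                  # index of the interval containing x
    c = (t + Fraction(1, 2)) / b ** k      # center of that interval
    return 2 * c - x

x = Fraction(1, 3)                         # binary digits 0.01010101...
simple = reflect_1d(x, 0, 2)               # k = 0 is the reflection 1 - x
twice = reflect_1d(reflect_1d(x, 2, 2), 2, 2)
composed = reflect_1d(reflect_1d(x, 7, 2), 3, 2)
# Flipping binary digits 4 through 7 of 1/3 (1,0,1,0 -> 0,1,0,1) changes x by
# -1/16 + 1/32 - 1/64 + 1/128.
flipped_4_to_7 = x - Fraction(1, 16) + Fraction(1, 32) - Fraction(1, 64) + Fraction(1, 128)
```

Here `composed` and `flipped_4_to_7` coincide, in line with the remark that the composition of two reflections flips a finite block of digits rather than acting as a single reflection.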

It is useful to consider transformations in which some components of $x$
are reflected, while others get an identity transformation. For simplicity,
we adopt the special value $k = -1$, sometimes displayed simply as $-$, to
denote the identity transformation, so that $\mathcal{R}_{-1}(x) = x$ for
$x \in [0,1]$.

Definition 9. For the vector $\kappa = (k_1, \dots, k_d)$ with
$k_j \in \{-1, 0, 1, \dots\}$, the reflection $\mathcal{R}_\kappa$ of
$x \in [0,1]^d$ is defined by

(9) $\mathcal{R}_\kappa(x) = z \in [0,1]^d$ where $z^j = \mathcal{R}_{k_j}(x^j)$.
Figure 1 illustrates some reflections $\mathcal{R}_{(1,2)}$ and
$\mathcal{R}_{(-,2)}$ for $x \in [0,1)^2$ with $b = 2$, as well as a box fold
described below. Geometrically, a reflection of $x$ has some components
symmetric about the center of an elementary interval containing $x$ and all
other components equal to the corresponding ones of $x$.

Recall that the center of the elementary interval $B_{\kappa,\tau}$ is the
point

$c_{\kappa,\tau} = \left(\frac{t_1 + 1/2}{b^{k_1}}, \dots, \frac{t_d + 1/2}{b^{k_d}}\right)$.

For a vector $\kappa = (k_1, \dots, k_d)$ with $k_j \ge 0$, the point
$x \in [0,1)^d$ belongs to the elementary interval $B_{\kappa,\tau}$ for
$\tau = \tau(\kappa, x) = \lfloor b^\kappa x \rfloor$, with the
multiplication and floor operators taken componentwise. For such $\kappa$,
the reflection $\mathcal{R}_\kappa(x)$ may be written

(10) $\mathcal{R}_\kappa(x) = 2c_{\kappa,\tau(\kappa,x)} - x$.

Notice that $\mathcal{R}_\kappa(x)$ has some points of discontinuity whenever
$\max_j k_j \ge 1$ because then $c_{\kappa,\tau(\kappa,x)}$ jumps when $x$
crosses the boundary of certain base $b$ elementary intervals.

Definition 10. Let $x_1, \dots, x_n \in [0,1)^d$ and let $\mathcal{R}_\kappa$
be a $b$-ary reflection. The folded sequence
$\mathcal{F}_\kappa(x_1, \dots, x_n)$ is the sequence
$z_1, \dots, z_{2n} \in [0,1)^d$ with $z_i = x_i$ for $i = 1, \dots, n$ and
$z_i = \mathcal{R}_\kappa(x_{i-n})$ for $i = n+1, \dots, 2n$.

If $\mathcal{F}_\kappa(\mathcal{F}_{\kappa'}(x_1, \dots, x_n))$ and
$\mathcal{F}_{\kappa'}(\mathcal{F}_\kappa(x_1, \dots, x_n))$ are both well
defined, then they both have the same points, but possibly in a different
order. In this sense, folding is commutative. If $r$ folds have been applied,
then the sample size is $2^r n$, perhaps including some points multiple
times.
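
Definitions 9 and 10 translate directly into code. The following sketch is illustrative, with assumed function names and arbitrary sample points; it uses exact fractions and encodes the identity component as $k_j = -1$, as in the text. It also checks the commutativity remark on a small example.

```python
from fractions import Fraction
from math import floor

def reflect(x, kappa, b):
    """Componentwise reflection R_kappa(x); k_j = -1 leaves component j unchanged."""
    z = []
    for xj, kj in zip(x, kappa):
        if kj < 0:
            z.append(xj)
        else:
            t = floor(xj * b ** kj)                        # interval index
            z.append(2 * (t + Fraction(1, 2)) / b ** kj - xj)
    return tuple(z)

def fold(points, kappa, b):
    """Folded sequence: the n original points followed by their n reflections."""
    return list(points) + [reflect(x, kappa, b) for x in points]

base = 2
pts = [(Fraction(1, 3), Fraction(1, 5)), (Fraction(5, 7), Fraction(2, 9))]
ab = fold(fold(pts, (1, -1), base), (-1, 2), base)
ba = fold(fold(pts, (-1, 2), base), (1, -1), base)
```

Two folds applied to $n = 2$ points give $2^2 n = 8$ points, and the two orders of folding produce the same points in a different order.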




Fig. 1. This figure illustrates some base $b = 2$ digital reflections as
described in the text. The left panel shows 8 elementary intervals, one of
which contains a solid point with its $\mathcal{R}_{(1,2)}$ reflection. The
center panel shows 8 elementary intervals, one of which has a point with its
$\mathcal{R}_{(3,-)}$ reflection. The right panel shows 4 elementary
intervals, one of which includes a solid point $x$ with the other three
points of its box reflection $\mathcal{F}_{(1,-)}(\mathcal{F}_{(-,1)}(x))$.
For folding to improve on a digital net, it should produce a local
antithetic property within elementary intervals of volume comparable to
$b^{q-m}$. To see why, consider the alternatives, taking $q = 0$ for
simplicity. If reflections take place within elementary intervals of volume
$b^{-r} \ll b^{-m}$, then some elementary intervals of that small volume have
two nearly identical sampling points, while most have none. Conversely,
reflections within elementary intervals of volume $b^{-r} \gg b^{-m}$ are not
"local enough" to get the best error rate. In particular, if $r$ is constant
while $m \to \infty$, then one cannot expect an improved convergence rate,
though the leading constant might be better than without folding.

For $\kappa = (k_1, \dots, k_d)$ with $k_j \in \{-1, 0, 1, \dots\}$, let
$\kappa^+$ have components $k_j^+ = \max\{k_j, 0\}$ and put
$|\kappa^+| = \sum_{j=1}^d k_j^+$. Then for $x \in B_{\kappa^+,\tau}$ of
volume $b^{-|\kappa^+|}$, $\mathcal{R}_\kappa(x)$ is in the closed elementary
interval $\bar B_{\kappa^+,\tau}$. For reflections of a digital net, we
should use $\kappa$ with $|\kappa^+|$ close to $m - q$. When the reflections
get finer as $m$ increases, then the reflected scrambled nets will not
ordinarily be extensible.

Here we present three methods for inducing local antithetic properties in
some $(q,m,2)$-nets. They are given in increasing order with respect to the
number of reflections required.



3.1. Reflection nets. The reflection net takes the form
$\mathcal{F}_\kappa(x_1, \dots, x_n)$, where $x_1, \dots, x_n$ is a
$(\lambda,q,m,d)$-net in base $b$ and $\kappa$ is a vector of $d$ nonnegative
integers summing to $m - q$. The reflection net is a (relaxed)
$(2\lambda,q,m,d)$-net in base $b$.

For $d = 2$ and $q = 0$, we use $\kappa = (k_1, k_2)$, where each
$k_j \approx m/2$, specifically,

(11) $k_1 = \left\lfloor \frac{m+1}{2} \right\rfloor$ and $k_2 = m - k_1$.

These reflections treat each component of $x$ nearly equally, and reflect
within elementary intervals of volume $1/n$.



3.2. Box folded nets. The asymptotic error of scrambled net quadrature
from [21] is governed by the norm of the mixed partial derivative
$\partial^{1:d} f$. The reflection net may be thought of as averaging the
function $\tilde f(x) = (f(x) + f(\mathcal{R}_\kappa(x)))/2$ over a sample of
$n$ values of a scrambled net. The function $\tilde f(x)$ has a mixed partial
derivative almost everywhere, when $f$ does. If $x^j$ is a reflected
component, then $\partial \mathcal{R}_{k_j}(x^j)/\partial x^j = -1$ at almost
all points, and we find that mixed partial derivatives of $f$ of odd order
largely cancel, while those of even order are averaged. For $d = 2$, the
dominant term in the error comes from $\partial^{\{1,2\}} f$, which is of
even order and so does not cancel. Therefore, we consider another scheme that
averages

$\tilde f(x) = \frac{1}{4}\bigl( f(x) + f(\mathcal{R}_{(k_1,-)}(x)) + f(\mathcal{R}_{(-,k_2)}(x)) + f(\mathcal{R}_{(k_1,k_2)}(x)) \bigr)$

over $n$ points, with $k_1$ and $k_2$ as in (11). To construct these points,
we apply two folds as in
$\mathcal{F}_{(k_1,-)}(\mathcal{F}_{(-,k_2)}(x_1, \dots, x_n))$. The image
$\mathcal{F}_{(k_1,-)}(\mathcal{F}_{(-,k_2)}(x))$ is made up of 4 points,
symmetric about the center of a box containing $x$. One such quadruple is
shown in Figure 1.
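
The symmetry of the four box-fold images can be checked directly. The sketch below assumes $d = 2$, $q = 0$, and the split $k_1 = \lfloor (m+1)/2 \rfloor$, $k_2 = m - k_1$; the function names and the sample point are hypothetical. Because the four points average to the center of their box, any function that is linear on that box is averaged to its value at the center.

```python
from fractions import Fraction
from math import floor

def reflect_1d(x, k, b):
    """Reflection of x about the center of its base-b interval of width b**-k."""
    t = floor(x * b ** k)
    return 2 * (t + Fraction(1, 2)) / b ** k - x

def box_fold_point(x, m, b):
    """The four box-fold images of x in [0,1)^2 for a (0,m,2)-net."""
    k1 = (m + 1) // 2                  # assumed split of m between the two axes
    k2 = m - k1
    r1 = reflect_1d(x[0], k1, b)
    r2 = reflect_1d(x[1], k2, b)
    return [(x[0], x[1]), (r1, x[1]), (x[0], r2), (r1, r2)]

def linear(p):
    # Hypothetical linear test function.
    return 3 + 2 * p[0] - 5 * p[1]

quad = box_fold_point((Fraction(1, 3), Fraction(1, 5)), 5, 2)
center = (sum(p[0] for p in quad) / 4, sum(p[1] for p in quad) / 4)
avg_linear = sum(linear(p) for p in quad) / 4
```

With $m = 5$ and $b = 2$ the box has sides $2^{-3}$ and $2^{-2}$, so the point $(1/3, 1/5)$ lies in the box centered at $(5/16, 1/8)$.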

3.3. Monomial nets. A greedier reflection strategy folds together all of

$\mathcal{R}_{(0,m)}, \mathcal{R}_{(1,m-1)}, \mathcal{R}_{(2,m-2)}, \dots, \mathcal{R}_{(m,0)}$.

When these $m + 1$ folds are applied to a $(0,m,2)$-net in base $b$, the
resulting points correctly integrate any $f$ that is a sum of piecewise
linear functions linear within elementary intervals of volume $b^{-m}$ or
larger. Such "monomial nets" extend the local antithetic property of Haber's
stratification schemes to all elementary intervals of volume $b^{-m}$, not
just those from one vector $\kappa$. The cost is that the sample size is
multiplied by $2^{m+1}$, going from $b^m$ to $2(2b)^m$. When $b = 2$ the cost
is $2n^2$ function evaluations instead of $n$. For $b > 2$, the cost grows
superlinearly in $n$, but more slowly than the square of $n$:

$2(2b)^m = 2(2b)^{\log_b(n)} = 2^{1+\log_b(n)} n = 2n^{1+\log_b(2)}$.

4. Example from Sloan and Joe. To illustrate the three locally antithetic
strategies for nets, we consider an integrand studied by Sloan and Joe [25],

$g(x) = x^2 \exp(x^1 x^2)$, $x = (x^1, x^2) \in [0,1]^2$.

This function is bounded and has infinitely many continuous derivatives. We
can expect it to have all the smoothness that any of the reflection techniques
discussed above might be able to exploit. Also, there are no symmetries or
antisymmetries that would make reflection methods exact for this function.

This function has mean
$I = \int_0^1 \int_0^1 g(x^1, x^2)\,dx^1 dx^2 = e - 2$, and variance
$\sigma^2 = (3 - e)(7e - 11)/8$. Using Mathematica, one can find that the
ANOVA mean squares for the main effects are

$\sigma^2_{\{1\}} = \frac{1}{3}\bigl((10 - e)e - 15 + 2\mathrm{Ei}(1) - 2\mathrm{Ei}(2) + \log(4)\bigr)$

and

$\sigma^2_{\{2\}} = (3 - e)(e - 1)/2$,

where Ei is the exponential integral function,
$\mathrm{Ei}(z) = -\int_{-z}^\infty t^{-1} e^{-t}\,dt$. The relative variances
(sensitivity indices) of the ANOVA terms are

$\sigma^2_{\{1\}}/\sigma^2 = 0.0729$, $\sigma^2_{\{2\}}/\sigma^2 = 0.8561$ and $\sigma^2_{\{1,2\}}/\sigma^2 = 0.0710$.

This function has a meaningfully large bivariate term accounting for about
7.1 percent of the variance, and so it is not a nearly additive function.
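
The closed forms above can be cross-checked numerically. The following sketch is not the Mathematica computation referred to in the text; it is an assumed midpoint-rule approximation on a regular grid (the informal "ANOVA on a grid" picture from Section 2.4), with a hypothetical function name and grid size.

```python
import math

def anova_shares(g, grid=200):
    """Midpoint-rule estimates of mean, variance and ANOVA variance shares on [0,1]^2."""
    pts = [(i + 0.5) / grid for i in range(grid)]
    vals = [[g(x1, x2) for x2 in pts] for x1 in pts]
    mean = sum(map(sum, vals)) / grid ** 2
    var = sum(v * v for row in vals for v in row) / grid ** 2 - mean ** 2
    # Main effects: average out the other variable, then subtract the mean.
    f1 = [sum(row) / grid - mean for row in vals]
    f2 = [sum(vals[i][j] for i in range(grid)) / grid - mean for j in range(grid)]
    var1 = sum(v * v for v in f1) / grid
    var2 = sum(v * v for v in f2) / grid
    var12 = var - var1 - var2           # what remains is the interaction term
    return mean, var, (var1 / var, var2 / var, var12 / var)

def g(x1, x2):
    # The integrand of Section 4; superscripts in the text index components.
    return x2 * math.exp(x1 * x2)

mean, var, shares = anova_shares(g)
```

The three shares sum to one by construction, and they should agree with the sensitivity indices quoted above to the displayed precision.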
For this paper, we consider a scaled version of $g$, namely,

(12) $f(x) = \frac{x^2 \exp(x^1 x^2)}{e - 2}$, $x = (x^1, x^2) \in [0,1]^2$.

With this scaling, $\int f(x)\,dx = 1$ and so absolute and relative errors
coincide.

All of the integration techniques we consider here are based on the con- 
struction of (0,m, 2)-nets given by Faure [7]. The bases used were b = 
2,3,5,7. The points were either unscrambled, ASM scrambled, or given a 
random linear scrambling. Nested uniform and I-binomial scrambling were
not tried because they have the same variance as random linear scrambling. 
For each base and scrambling method, reflection nets, box nets and mono- 
mial nets were tried. 

The monomial nets did not perform very well, most likely because of the 
superlinear (in n) sample size that they required. In some instances they 
were slightly better than the original (0, m, 2)-nets, but not nearly as good 
as the other methods. For the other methods, over values of n up to the first 
power of b larger than 2000, the base 2 methods were almost always the 
best. Accordingly, we work with b = 2 and then extend the computations 
out to re = 2^^. For methods with reflections, the sample sizes go out to 2^^, 
while for box folds, the sample sizes go to 2^^. 

Figure 2 shows the error for this function with the methods described 
above. For deterministic methods, the absolute error is shown. For random- 
ized methods, the root mean squared error from 300 independent replications 
is shown. The upper left panel shows, from top to bottom, the error for
unscrambled, random linear scrambled and ASM scrambled Faure points. The
Faure points lie very close to the $O(n^{-1})$ reference line, with no
apparent evidence of a logarithmic factor. The matrix scrambled points are
close to the $O(n^{-3/2})$ reference line. The ASM scrambled points seem to
follow $O(n^{-3/2})$ at first, then approach the $O(n^{-2})$ reference before
leveling out.

The upper right panel shows the same three methods, with a reflection
incorporated. The curve for ASM scrambling keeps crossing the $n^{-2}$
reference line. The curve for random linear scrambling lies just below the
$n^{-3/2}$ reference. The curve for reflection without scrambling has a
prominent flat spot for $n \le 32{,}768$. Then it gets much better at 65,536.

The lower left panel shows the three methods with box symmetry. Here
the curve for random linear scrambling lies between the references for
$n^{-3/2}$ and $n^{-2}$ and ends up roughly parallel to the latter. The curve
for ASM scrambling ends up below the $n^{-2}$ reference line. The curve for
the box symmetrized Faure sequence follows the one for random linear
scrambling, but has an error that is not monotone in $n$.

For each kind of symmetry, the ASM scrambling seems to give the best 
results on this function. The lower right panel shows all three ASM methods. 
[Figure 2: four log-log panels titled Unreflected, Reflected, Box reflected and All ASM; each horizontal axis runs from 1e+00 to 1e+06.]

Fig. 2. Shown are absolute errors for the Faure sequence and sample RMSEs from 300
replications for scrambled versions, in the quadrature example of Section 4. The lower right
panel is for ASM scrambling: unreflected (solid), reflected (dashed) and box (dotted). The
other panels depict unscrambled (solid), linearly scrambled (dashed) and ASM scrambled
(dotted) results. All panels have reference lines proportional to labeled powers of $n$.

From top to bottom at the right of that panel they are for the original points, 
reflected points and boxed points. 

From this example it is clear that reflection strategies have potential to
bring improvements and may even yield a rate better than $O(n^{-3/2})$. There
are also some prominent flat spots and reversals in the errors. In the next
sections we investigate box reflections and show that they can improve the
error rate.

5. Variance for scrambled digital nets. The error rate analysis for box 
reflection of scrambled digital nets builds on the analysis for unreflected 
scrambled nets. This section recaps some needed material for completeness, 
widens the generality, and corrects an error in the original proof. 

We begin by recapping a base $b$ Haar wavelet multiresolution of functions
on $[0,1)^d$. For more details, see [20] and [21].

First define the univariate mother wavelets for $x \in \mathbb{R}$:

\[
\psi_c(x) = b^{1/2}\,1_{\lfloor bx\rfloor = c} - b^{-1/2}\,1_{\lfloor x\rfloor = 0},
\qquad c = 0, 1, \dots, b-1.
\]

The familiar ($b = 2$) Haar wavelet decomposition only needs one mother
wavelet because it has $\psi_0 = -\psi_1$. The general setting considered here
requires more than one mother wavelet. Next, for nonnegative integers $k$ and
$t < b^k$ define dilated and translated versions for $x \in [0,1)$,

\[
\psi_{ktc}(x) = b^{k/2}\,\psi_c(b^k x - t)
= b^{(k+1)/2} N_{ktc}(x) - b^{(k-1)/2} W_{kt}(x).
\]

The functions $N$ and $W$ are indicators of relatively narrow and wide
intervals respectively, where the base $b$ is understood:
$N_{ktc}(x) = 1_{\lfloor b^{k+1}x\rfloor = bt+c}$ and $W_{kt}(x) = 1_{\lfloor b^k x\rfloor = t}$.
Each $\psi_{ktc}$ is a narrow rectangular spike minus another one that is $b$
times as wide, but $1/b$ times as high.
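
The following sketch tabulates a wavelet $\psi_{ktc}$ on the $b^{k+1}$ equal cells of $[0,1)$ and computes inner products exactly as finite sums. It assumes the form $\psi_{ktc} = b^{(k+1)/2}N_{ktc} - b^{(k-1)/2}W_{kt}$, with $N_{ktc}$ the indicator of cell $bt+c$ at resolution $b^{k+1}$ and $W_{kt}$ the indicator of cell $t$ at resolution $b^k$; the function names `psi_cells` and `inner` are hypothetical.

```python
def psi_cells(b, k, t, c):
    """Values of psi_{ktc} on the b**(k+1) equal-width cells of [0, 1)."""
    vals = []
    for cell in range(b ** (k + 1)):
        narrow = 1.0 if cell == b * t + c else 0.0   # N_{ktc}
        wide = 1.0 if cell // b == t else 0.0        # W_{kt}
        vals.append(b ** ((k + 1) / 2) * narrow - b ** ((k - 1) / 2) * wide)
    return vals

def inner(b, k, u, v):
    """Integral over [0, 1) of the product of two step functions on the same grid."""
    width = float(b) ** -(k + 1)
    return width * sum(x * y for x, y in zip(u, v))

# Base 3, scale k = 2, translate t = 4: three wavelets share one wide interval.
b, k, t = 3, 2, 4
family = [psi_cells(b, k, t, c) for c in range(b)]
gram = [[inner(b, k, p, q) for q in family] for p in family]
```

Under these assumptions the Gram matrix has entries $1_{c=c'} - 1/b$, and each wavelet integrates to zero. For $b = 2$ the two wavelets at a given $(k, t)$ are negatives of each other, which is why a single mother wavelet suffices there.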

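The wavelets for $d > 1$ are tensor products of functions of the form
$\psi_{ktc}$. For $u \subseteq 1{:}d$ let $\kappa$ be a $|u|$-vector of integers
$k_j \ge 0$ for $j \in u$. Similarly, let $\tau$ be a $|u|$-vector of nonnegative
integers $t_j < b^{k_j}$ for $j \in u$. Notice that for $\kappa$ to be well defined
a set $u$ must be understood, and $\tau$ depends similarly on both $u$ and
$\kappa$. To avoid cluttered notation, we do not write $\kappa(u)$ or
$\tau(u,\kappa)$. The $d$ variate Haar wavelets in base $b$ take the form

\[
\psi_{u\kappa\tau\gamma}(x) = \prod_{j\in u} \psi_{k_j t_j c_j}(x^j),
\]

where $\gamma$ is the $|u|$-vector of indices $c_j \in \{0,\dots,b-1\}$ for
$j \in u$, with $\psi_{\emptyset()()()}(x) = 1$ by convention.
The multiresolution of $f \in L^2[0,1)^d$ is

\[
f(x) = \sum_u \sum_\kappa \sum_\tau \sum_\gamma
\langle \psi_{u\kappa\tau\gamma}, f\rangle\, \psi_{u\kappa\tau\gamma}(x),
\qquad
\langle \psi_{u\kappa\tau\gamma}, f\rangle = \int \psi_{u\kappa\tau\gamma}(x) f(x)\,dx,
\]

where each summation is over all possible values for its argument, beginning
with all subsets $u$ of $\{1,\dots,d\}$.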
It is convenient to write $f(x) = \sum_u \sum_\kappa \nu_{u\kappa}(x)$, where

\[
\nu_{u\kappa}(x) = \sum_\tau \sum_\gamma
\langle \psi_{u\kappa\tau\gamma}, f\rangle\, \psi_{u\kappa\tau\gamma}(x).
\]

The function $\nu_{u\kappa}(x)$ is a step function constant within elementary intervals
of the form $B_{u,\kappa,\tau}$.

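If $x_1,\dots,x_n$ are obtained by making a nested uniform (or random linear
or I-binomial) scramble of points $a_1,\dots,a_n \in [0,1)^d$ in base $b$, then
the variance of $\hat I = n^{-1}\sum_{i=1}^n f(x_i)$ is

\[
V(\hat I) = \frac{1}{n} \sum_{|u|>0} \sum_\kappa \Gamma_{u\kappa}\,\sigma^2_{u\kappa},
\tag{13}
\]

where

\[
\sigma^2_{u\kappa} = \int \nu_{u\kappa}(x)^2\,dx
\]

and the "gain coefficients" are given by

\[
\Gamma_{u\kappa} = \frac{1}{n(b-1)^{|u|}} \sum_{i=1}^n \sum_{i'=1}^n \prod_{j\in u}
\Bigl( b\,1_{\lfloor b^{k_j+1}a_i^j\rfloor = \lfloor b^{k_j+1}a_{i'}^j\rfloor}
 - 1_{\lfloor b^{k_j}a_i^j\rfloor = \lfloor b^{k_j}a_{i'}^j\rfloor} \Bigr).
\]

From the "multiresolution ANOVA," $\sigma^2 = \sum_{|u|>0}\sum_\kappa \sigma^2_{u\kappa}$.
Therefore, the variance of ordinary Monte Carlo sampling has the form (13)
with all $\Gamma_{u\kappa} = 1$. The variance reduction from randomized nets
arises from $\Gamma_{u\kappa} \ll 1$ for some $u$ and $\kappa$ without allowing
$\Gamma_{u\kappa} \gg 1$ for any $u$ and $\kappa$. In particular, if
$a_1,\dots,a_n$ are a $(\lambda,q,m,d)$-net in base $b$, then
$\Gamma_{u\kappa} = 0$ if $m - q \ge |\kappa| + |u|$.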
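
The next sketch evaluates gain coefficients for a small point set. It assumes the double-sum form $\Gamma_{u\kappa} = \frac{1}{n(b-1)^{|u|}}\sum_{i}\sum_{i'}\prod_{j\in u}(b\,1\{\text{same cell at scale } k_j+1\} - 1\{\text{same cell at scale } k_j\})$, and it uses the two-dimensional Hammersley points in base 2 as a convenient example of a $(0,m,2)$-net. The function names are hypothetical, and points are stored as integer numerators over $b^m$ so that the floor operations are exact.

```python
from itertools import product

def hammersley_base2(m):
    """Integer numerators (over 2**m) of the 2-D Hammersley points in base 2."""
    pts = []
    for i in range(2 ** m):
        bits = format(i, "0{}b".format(m)) if m > 0 else ""
        rev = int(bits[::-1], 2) if m > 0 else 0   # radical inverse numerator
        pts.append((i, rev))
    return pts

def gain(points, m, b, u, kappa):
    """Gain coefficient for coordinates u at scales kappa (assumed formula)."""
    n = len(points)
    total = 0
    for p, q in product(points, repeat=2):
        term = 1
        for j, k in zip(u, kappa):
            fine = (p[j] * b ** (k + 1)) // b ** m == (q[j] * b ** (k + 1)) // b ** m
            coarse = (p[j] * b ** k) // b ** m == (q[j] * b ** k) // b ** m
            term *= b * int(fine) - int(coarse)
        total += term
    return total / (n * (b - 1) ** len(u))

pts = hammersley_base2(4)   # n = 16 points
coarse_gains = [gain(pts, 4, 2, (0, 1), (k1, k2))
                for k1 in range(3) for k2 in range(3) if k1 + k2 + 2 <= 4]
all_gains = [gain(pts, 4, 2, (0, 1), (k1, k2))
             for k1 in range(5) for k2 in range(5)]
```

For this example the coefficients with $|\kappa| + |u| \le m$ come out as zero, and the remaining ones stay within the bound $e$ quoted for $(0,m,d)$-nets.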

Theorem 2. Let $a_1,\dots,a_n$ be a $(0,m,d)$-net in base $b \ge 2$. Then

0<r„<(— ) <(— ) <e^2.718. 

Let $a_1,\dots,a_n$ be a $(\lambda,0,m,d)$-net in base $b \ge 2$. Then

\[
0 \le \Gamma_{u\kappa} \le e + 1 \doteq 3.718.
\]

Let $a_1,\dots,a_n$ be a $(\lambda,q,m,d)$-net in base $b \ge 2$. Then

b ^^-1 



< F„ . < 5« 



6-1 



Proof. The first part is from [20], the second is from [21], and the third 
is from [22]. □ 

Theorem 2 shows some upper bounds on gain coefficients for nets. Sharper, 
but more complicated bounds are available from intermediate stages of the 
proofs, particularly the ones in [22]. Still sharper bounds are available in 
[18] and in [31]. 

5.1. Scrambled net variance for smooth functions. There is an error in
the way that the $O(\cdots)$ terms are gathered in Lemma 1 of [21]. This section
repairs the proof of the $O(n^{-3}\log(n)^{d-1})$ result for the variance of
scrambled net integrals of smooth functions. In the process, a more general
result is obtained, using a weaker definition of smoothness than in the
original paper, and covering nets with nonzero quality parameter and relaxed
versions of $(\lambda,q,m,d)$-nets.

The proof follows the lines of [21]. Lemmas 2 and 3 here replace Lemmas 
1 and 2 there, respectively. 

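Lemma 2. Suppose that $f$ is a smooth function on $[0,1]^d$. For $b \ge 2$
and $u \subseteq \{1,\dots,d\}$, let $\kappa$ and $\tau$ be $|u|$-tuples of nonnegative
integers with components $k_j$ and $t_j < b^{k_j}$ for $j \in u$. Then

\[
|\langle f,\psi_{u\kappa\tau\gamma}\rangle| \le
\Bigl(\frac{b-1}{b}\Bigr)^{|u|} b^{-(3|\kappa|+|u|)/2}
\sup_{x\in B_{u,\kappa,\tau}} |\partial^u f_u(x)|.
\tag{14}
\]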

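Proof. From the definitions,

\[
\langle f,\psi_{u\kappa\tau\gamma}\rangle
= \langle f_u,\psi_{u\kappa\tau\gamma}\rangle
= b^{-(|\kappa|+|u|)/2} \int f_u(x) \prod_{j\in u} b^{k_j+1}
\bigl(N_{k_jt_jc_j}(x^j) - b^{-1}W_{k_jt_j}(x^j)\bigr)\,dx.
\tag{15}
\]

Next, $f_u(x)$ depends on $x$ only through $x^u$. Applying (7) to $f_u$, we may
write

\[
f_u(x) = \sum_{v\subseteq u} \int_{[c^v_{u\kappa\tau},\,x^v]}
\partial^v f_u(c^{-v}_{u\kappa\tau} : y^v)\,dy^v.
\tag{16}
\]

If $v \ne u$, then the corresponding term in (16) does not depend on $x^{u-v}$
and is therefore orthogonal to $N_{k_jt_jc_j}(x^j) - b^{-1}W_{k_jt_j}(x^j)$ for
$j \in u - v$. Accordingly, we may replace $f_u$ in (15) by the $v = u$ term
from (16). Also, the integrand in (15) vanishes for $x \notin B_{u,\kappa,\tau}$.
Putting these together, we find that
$b^{(|\kappa|+|u|)/2}\,|\langle f,\psi_{u\kappa\tau\gamma}\rangle|$ equals

\[
\begin{aligned}
&\biggl| \int_{B_{u,\kappa,\tau}} \int_{[c^u_{u\kappa\tau},\,x^u]}
\partial^u f_u(c^{-u}_{u\kappa\tau} : y^u)\,dy^u
\prod_{j\in u} b^{k_j+1}\bigl(N_{k_jt_jc_j}(x^j) - b^{-1}W_{k_jt_j}(x^j)\bigr)\,dx \biggr| \\
&\qquad \le \sup_{x\in B_{u,\kappa,\tau}}
\biggl| \int_{[c^u_{u\kappa\tau},\,x^u]} \partial^u f_u(c^{-u}_{u\kappa\tau} : y^u)\,dy^u \biggr|
\times \int_{B_{u,\kappa,\tau}} \prod_{j\in u} b^{k_j+1}
\bigl|N_{k_jt_jc_j}(x^j) - b^{-1}W_{k_jt_j}(x^j)\bigr|\,dx \\
&\qquad = (2 - 2/b)^{|u|} \sup_{x\in B_{u,\kappa,\tau}}
\biggl| \int_{[c^u_{u\kappa\tau},\,x^u]} \partial^u f_u(c^{-u}_{u\kappa\tau} : y^u)\,dy^u \biggr|.
\end{aligned}
\]

By Lemma 1, $\partial^u f_u$ is continuous, and so by the mean value theorem,
there is a point $z \in B_{u,\kappa,\tau}$ with

\[
\biggl| \int_{[c^u_{u\kappa\tau},\,x^u]} \partial^u f_u(c^{-u}_{u\kappa\tau} : y^u)\,dy^u \biggr|
= \mathrm{Vol}(\mathrm{rect}[c^u_{u\kappa\tau}, x^u])\,|\partial^u f_u(z)|
\le 2^{-|u|} b^{-|\kappa|}\,|\partial^u f_u(z)|.
\]

The factor $b^{-|\kappa|}$ is the volume of a $|u|$-dimensional elementary
interval containing both $c^u_{u\kappa\tau}$ and $x^u$. The factor $2^{-|u|}$
arises because $c^u_{u\kappa\tau}$ is at the center of this elementary interval
and $x^u$ is in some sub-interval defined by $c^u_{u\kappa\tau}$ and one of the
corners of that elementary interval. Finally,

\[
|\langle f,\psi_{u\kappa\tau\gamma}\rangle| \le (1 - 1/b)^{|u|}\, b^{-(3|\kappa|+|u|)/2}
\sup_{z\in B_{u,\kappa,\tau}} |\partial^u f_u(z)|.
\]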

□ 
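
As a numerical sanity check of the bound (14), the sketch below works in $d = 1$ with the test function $f(x) = x^2$, for which the coefficient against $\psi_{ktc}$ can be integrated in closed form. It assumes the bound reads $|\langle f,\psi_{ktc}\rangle| \le \frac{b-1}{b}\, b^{-(3k+1)/2}\sup|f'|$ with the supremum over the wide interval; the function names and the choice of test function are assumptions made for the example.

```python
def coeff_x2(b, k, t, c):
    """<f, psi_{ktc}> for f(x) = x**2, using exact integrals on each interval."""
    def int_x2(lo, hi):
        return (hi ** 3 - lo ** 3) / 3.0
    n_lo = (b * t + c) / b ** (k + 1)        # narrow interval endpoints
    n_hi = (b * t + c + 1) / b ** (k + 1)
    w_lo, w_hi = t / b ** k, (t + 1) / b ** k  # wide interval endpoints
    return (b ** ((k + 1) / 2) * int_x2(n_lo, n_hi)
            - b ** ((k - 1) / 2) * int_x2(w_lo, w_hi))

def lemma2_bound_x2(b, k, t):
    """Assumed right-hand side of (14) in d = 1 for f(x) = x**2."""
    sup_deriv = 2.0 * (t + 1) / b ** k       # sup of |f'(x)| = 2x on the wide interval
    return (b - 1) / b * b ** (-(3 * k + 1) / 2) * sup_deriv

# Largest ratio |coefficient| / bound over a range of bases, scales and translates.
worst = max(abs(coeff_x2(b, k, t, c)) / lemma2_bound_x2(b, k, t)
            for b in (2, 3, 5) for k in range(5)
            for t in range(b ** k) for c in range(b))
```

In this example the ratio stays below one, consistent with the lemma, and the coefficients shrink like $b^{-3k/2}$ as the scale $k$ grows.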



Lemma 3. Under the conditions of Lemma 2,

\[
\sigma^2_{u\kappa} \le 2^{|u|} \Bigl(\frac{b-1}{b}\Bigr)^{3|u|} b^{-2|\kappa|}\,
\|\partial^u f_u\|_\infty^2.
\tag{17}
\]



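Proof. The supports of $\psi_{u\kappa\tau\gamma}$ and $\psi_{u\kappa\tau'\gamma'}$ are
disjoint unless $\tau = \tau'$, and so

\[
\nu_{u\kappa}(x)^2 = \sum_\tau \sum_\gamma \sum_{\gamma'}
\langle f,\psi_{u\kappa\tau\gamma}\rangle \langle f,\psi_{u\kappa\tau\gamma'}\rangle\,
\psi_{u\kappa\tau\gamma}(x)\,\psi_{u\kappa\tau\gamma'}(x).
\]

Now

\[
\begin{aligned}
\sigma^2_{u\kappa}
&= \sum_\tau \sum_\gamma \sum_{\gamma'}
\langle f,\psi_{u\kappa\tau\gamma}\rangle \langle f,\psi_{u\kappa\tau\gamma'}\rangle
\int \psi_{u\kappa\tau\gamma}(x)\,\psi_{u\kappa\tau\gamma'}(x)\,dx \\
&= \sum_\tau \sum_\gamma \sum_{\gamma'}
\langle f,\psi_{u\kappa\tau\gamma}\rangle \langle f,\psi_{u\kappa\tau\gamma'}\rangle
\prod_{j\in u} \bigl(1_{c_j = c_j'} - b^{-1}\bigr) \\
&\le \biggl( \sum_{c=0}^{b-1} \sum_{c'=0}^{b-1} \bigl|1_{c=c'} - b^{-1}\bigr| \biggr)^{|u|}
\sum_\tau \Bigl(\frac{b-1}{b}\Bigr)^{2|u|} b^{-3|\kappa|-|u|}
\sup_{z\in B_{u,\kappa,\tau}} |\partial^u f_u(z)|^2 \\
&\le 2^{|u|} \Bigl(\frac{b-1}{b}\Bigr)^{3|u|} b^{-2|\kappa|}\,
\|\partial^u f_u\|_\infty^2,
\end{aligned}
\]

where the last step uses $\sum_{c}\sum_{c'} |1_{c=c'} - b^{-1}| = 2(b-1)$ and the
fact that there are $b^{|\kappa|}$ values of $\tau$. □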

Theorem 3. Let $x_1$ through $x_n$ be the points of a randomized relaxed
$(\lambda,q,m,d)$-net in base $b$. Suppose that as $n \to \infty$ with $\lambda$ and
$q$ fixed, that all of the gain coefficients of the net satisfy
$\Gamma_{u\kappa} \le G < \infty$. Then for smooth $f$,

\[
V(\hat I) = O\biggl( \frac{(\log n)^{d-1}}{n^3} \biggr).
\]

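Proof. If $|\kappa| + |u| \le m - q$, then the digital net property of
$x_1,\dots,x_n$ yields $\Gamma_{u\kappa} = 0$. Otherwise, we have
$\Gamma_{u\kappa} \le G$, and so

\[
\begin{aligned}
V(\hat I) &= \frac{1}{n} \sum_{|u|>0} \sum_{|\kappa| > (m-q-|u|)_+} \Gamma_{u\kappa}\,\sigma^2_{u\kappa}
\le \frac{G}{n} \sum_{|u|>0} \sum_{|\kappa| > (m-q-|u|)_+} \sigma^2_{u\kappa} \\
&\le \frac{G}{n} \sum_{|u|>0} 2^{|u|} \Bigl(\frac{b-1}{b}\Bigr)^{3|u|}
\|\partial^u f_u\|_\infty^2 \sum_{|\kappa| > (m-q-|u|)_+} b^{-2|\kappa|}.
\end{aligned}
\tag{18}
\]

Because we are interested in the limit as $m \to \infty$, we may suppose that
$m > d + q$. For such large $m$,

\[
\sum_{|\kappa| > (m-q-|u|)_+} b^{-2|\kappa|}
= \sum_{r=m-q-|u|+1}^{\infty} b^{-2r} \binom{r+|u|-1}{|u|-1},
\]

where the binomial coefficient is the number of $|u|$-vectors $\kappa$ of
nonnegative integers that sum to $r$. Making the substitution
$s = r - m + q + |u|$,

\[
\begin{aligned}
\sum_{|\kappa| > (m-q-|u|)_+} b^{-2|\kappa|}
&= b^{-2m+2q+2|u|} \sum_{s=1}^{\infty} b^{-2s} \binom{s+m-q-1}{|u|-1} \\
&\le \frac{b^{-2m}\, b^{2q+2|u|}}{(|u|-1)!} \sum_{s=1}^{\infty} b^{-2s} (s+m-q-1)^{|u|-1} \\
&\le \frac{b^{-2m}}{(|u|-1)!}\, b^{2(q+|u|-1)} \sum_{s=1}^{\infty} (m-q-1+s)^{|u|-1}\, b^{-2(s-1)} \\
&\le \frac{b^{-2m}}{(|u|-1)!}\, b^{2(q+|u|-1)}\, m^{|u|-1} \sum_{s=1}^{\infty} b^{-2(s-1)} s^{|u|-1} \\
&= O\bigl(n^{-2} \log(n)^{|u|-1}\bigr),
\end{aligned}
\tag{19}
\]

because the infinite sum converges, $m \le \log_b(n)$ and $|u| \le d$. The theorem
follows upon substituting the bound (19) into (18). □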
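
The counting step in this proof can be checked numerically. The sketch below confirms by brute force that the number of $|u|$-vectors of nonnegative integers summing to $r$ is $\binom{r+|u|-1}{|u|-1}$, and evaluates a truncated version of the tail sum of $b^{-2|\kappa|}$ over $|\kappa| > M$. The function names and the truncation length are assumptions made for the example.

```python
from itertools import product
from math import comb

def count_vectors(r, size):
    """Brute-force count of `size`-vectors of nonnegative integers summing to r."""
    return sum(1 for kappa in product(range(r + 1), repeat=size) if sum(kappa) == r)

def tail_sum(b, size, M, terms=200):
    """Truncated sum of b**(-2*|kappa|) over vectors kappa with |kappa| > M."""
    return sum(float(b) ** (-2 * r) * comb(r + size - 1, size - 1)
               for r in range(M + 1, M + 1 + terms))

# Ratio of the tail to b**(-2M) * M**(size-1); it should stay bounded as M grows.
ratios = {(size, M): tail_sum(2, size, M) / (2.0 ** (-2 * M) * M ** (size - 1))
          for size in (2, 3) for M in (10, 20, 40)}
```

With $M$ playing the role of $m - q - |u|$, a bounded ratio is what turns the tail into a term of order $b^{-2m} m^{|u|-1}$, and hence $O(n^{-2}\log(n)^{|u|-1})$.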

6. Scrambled net variance with box folding. This section investigates
the effects of reflection schemes on scrambled net variance. Reflections are
written as $\mathcal{R}_\rho$, where $\rho$ is a $d$ vector of integers
$r_j \ge -1$. As before, we let $\kappa$ denote a scale for the multiresolution
analysis.

In Section 5.1 the coefficients $\langle f,\psi_{u\kappa\tau\gamma}\rangle$ are bounded in
terms of mixed partial derivatives of $f$ taken once with respect to each
component $x^j$ for $j \in u$. Reflection is a piece-wise differentiable
operation. The function $\mathcal{R}_\rho(x)$ is discontinuous at $x$ if
$x^j = t b^{-r_j}$ holds for some $j$ with $r_j \ge 0$ and some positive integer
$t < b^{r_j}$. In the interior of the pieces, reflection of $x^j$ reverses the
sign of the derivative with respect to $x^j$. This sign reversal can be
exploited to produce a cancellation effect that reduces a bound on
$\langle f,\psi_{u\kappa\tau\gamma}\rangle$.

To simplify some expressions, we define the composite function $f^\rho$ by
$f^\rho(x) = f(\mathcal{R}_\rho(x))$. At almost all points $x \in [0,1]^d$ the chain
rule gives

\[
\partial^{1:d} f^\rho(x) = (-1)^{\mathrm{sgn}(\rho)}\, \partial^{1:d} f(\mathcal{R}_\rho(x)),
\tag{20}
\]

where $\mathrm{sgn}(\rho) = \sum_{j=1}^d 1_{r_j \ge 0}$ counts the number of
reflections in $\rho$. The factor $\partial^{1:d} f(\mathcal{R}_\rho(x))$ in the
right-hand side of (20) is the partial derivative of $f$, evaluated at the
point $z = \mathcal{R}_\rho(x)$, and not the partial derivative of
$f \circ \mathcal{R}_\rho$ evaluated at $x$, which appears on the left-hand side.

Definition 11. In $d$ dimensions, a box folding scheme is an average
of $2^d$ reflections as described below. Start with $\rho = (r_1,\dots,r_d)$,
where each $r_j \ge 0$. For $\ell = 0,\dots,2^d - 1$, let $\rho_\ell$ be the $d$
vector of integer components $r_{\ell j} \in \{r_j, -1\}$ with $r_{\ell j} = r_j$
if and only if the $j$th base 2 digit of $\ell$ is one. Then the box fold
scheme is

\[
\hat I = \frac{1}{n\,2^d} \sum_{\ell=0}^{2^d-1} \sum_{i=1}^n f^{\rho_\ell}(x_i)
= \frac{1}{n} \sum_{i=1}^n \tilde f(x_i),
\]

where $\tilde f(x) = 2^{-d} \sum_{\ell=0}^{2^d-1} f^{\rho_\ell}(x)$.
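
A minimal sketch of Definition 11 follows. It assumes that a reflection with $r_j \ge 0$ maps $x^j$ to $2k^j - x^j$, where $k^j = b^{-r_j}(\lfloor b^{r_j} x^j\rfloor + 1/2)$ is the center of the base $b$ cell of width $b^{-r_j}$ containing $x^j$, and that $r_j = -1$ leaves $x^j$ unchanged. The function names `reflect` and `box_fold` are hypothetical.

```python
import math
from itertools import product

def reflect(x, rho, b=2):
    """Apply the reflection R_rho: reflect x^j about its cell center when r_j >= 0."""
    out = []
    for xj, rj in zip(x, rho):
        if rj < 0:
            out.append(xj)                       # r_j = -1 means no reflection
        else:
            center = (math.floor(b ** rj * xj) + 0.5) / b ** rj
            out.append(2.0 * center - xj)
    return tuple(out)

def box_fold(f, rho, b=2):
    """Return the folded function: the average of f over all 2**d on/off patterns."""
    d = len(rho)
    def folded(x):
        total = 0.0
        for mask in product((False, True), repeat=d):
            rho_l = tuple(r if on else -1 for r, on in zip(rho, mask))
            total += f(reflect(x, rho_l, b))
        return total / 2 ** d
    return folded

# A linear integrand: folding replaces f(x) by f at the center of the cell of x.
f_lin = lambda x: 3.0 * x[0] + 5.0 * x[1] + 1.0
folded_value = box_fold(f_lin, (1, 1))((0.3, 0.8))
```

For a linear integrand the folded function is constant within each cell, which illustrates the first-order cancellation that the analysis below exploits. Each evaluation of the folded function costs $2^d$ evaluations of $f$.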
Sometimes it is more convenient to index the reflections by $2^d$ subsets
$v \subseteq \{1,\dots,d\}$. Let $v = v(\ell)$ denote the subset where $j \in v$ if
and only if the $j$th binary digit of $\ell$ is a one. Taking $\rho_v$ to mean
$\rho_\ell$ where $v = v(\ell)$, we may write
$\tilde f(x) = 2^{-d} \sum_{v \subseteq 1:d} f^{\rho_v}(x)$. From the definition of
$v$, we find that $(-1)^{\mathrm{sgn}(\rho_v)} = (-1)^{|v|}$.

To get ANOVA components of $\tilde f$, we need the ANOVA components of
$f^\rho$. Lemma 4 below shows that reflection commutes with the operation of
taking ANOVA components.

Lemma 4. Let $f$ be an $L^2$ function on $[0,1]^d$. Let
$f^\rho(x) = f(\mathcal{R}_\rho(x))$, where $\rho$ is a $d$ vector of integers
$r_j \ge -1$ for $j = 1,\dots,d$. Let $u \subseteq \{1,\dots,d\}$. Then

\[
f^\rho_u(x) = f_u(\mathcal{R}_\rho(x)).
\tag{21}
\]

Proof. The proof follows by induction on $|u|$. □

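The bounds for $\langle f,\psi_{u\kappa\tau\gamma}\rangle$ in Section 5.1 made use of
differentiability of $f$, which we cannot assume for $f^\rho$. The derivation
as far as equation (15) does follow for $f^\rho$ and so
$\langle f^\rho,\psi_{u\kappa\tau\gamma}\rangle$ equals

\[
b^{-(|\kappa|+|u|)/2} \int f^\rho_u(x) \prod_{j\in u} b^{k_j+1}
\bigl(N_{k_jt_jc_j}(x^j) - b^{-1}W_{k_jt_j}(x^j)\bigr)\,dx.
\tag{22}
\]

The next step in the derivation of bounds for
$\langle f,\psi_{u\kappa\tau\gamma}\rangle$ required $\partial^u f$ at points of
$B_{u,\kappa,\tau}$, and $\partial^u f^\rho$ does not necessarily exist.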

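The setting is simplest if the scale $\kappa$ is finer than the reflection
$\rho$. Suppose that $u = \{1,\dots,d\}$ and that $k_j \ge r_j$ for
$j = 1,\dots,d$. This specifically includes cases with $r_j = -1$ that
designate no reflection for component $j$. Then, for smooth $f$,
$\partial^u f^\rho_u$ is uniformly continuous on the interior of
$B_{u,\kappa,\tau}$. Letting $c_{u\kappa\tau}$ be the center of $B_{u,\kappa,\tau}$ as
before, we find that

\[
\begin{aligned}
\langle f^\rho,\psi_{u\kappa\tau\gamma}\rangle
&= \int \int_{[c^u_{u\kappa\tau},\,x^u]} \partial^u f^\rho_u(c^{-u}_{u\kappa\tau} : y^u)\,dy^u\,
\psi_{u\kappa\tau\gamma}(x)\,dx \\
&= (-1)^{\mathrm{sgn}(\rho)} \int \int_{[c^u_{u\kappa\tau},\,x^u]}
\partial^u f_u(c^{-u}_{u\kappa\tau} : \mathcal{R}_\rho(y)^u)\,dy^u\,
\psi_{u\kappa\tau\gamma}(x)\,dx \\
&= (-1)^{\mathrm{sgn}(\rho)} \int \int_{[c_{u\kappa\tau},\,x]}
\partial^u f_u(\mathcal{R}_\rho(y))\,dy\,
\psi_{u\kappa\tau\gamma}(x)\,dx,
\end{aligned}
\tag{23}
\]

where at the last step we use $u = 1{:}d$ and $-u = \emptyset$.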
Lemma 5. Suppose that $f$ is a doubly smooth function on $[0,1]^d$. Let
$\rho = (r_1,\dots,r_d)$ with integers $r_j \ge 0$. Take
$|\rho| = \sum_{j=1}^d r_j$, and let $\tilde f$ be defined by the box folding
scheme of Definition 11. For $b \ge 2$ and $u = \{1,\dots,d\}$, let $\kappa$,
$\tau$ and $\gamma$ be $d$-tuples of nonnegative integers with components
$k_j \ge r_j$, $t_j < b^{k_j}$, and $c_j < b$ respectively, for $j = 1,\dots,d$.
Then

\[
|\langle \tilde f,\psi_{u\kappa\tau\gamma}\rangle| \le
b^{-|\rho|} \Bigl(\frac{b-1}{b}\Bigr)^{d} b^{-(3|\kappa|+|u|)/2}\,
\|\partial^{u,u} f_u\|_\infty.
\tag{24}
\]

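Proof. Because $\kappa$ is on a finer scale than all of the reflections
$\rho_\ell$, equation (23) holds for each of them. Therefore,

\[
\langle \tilde f,\psi_{u\kappa\tau\gamma}\rangle
= 2^{-d} \int \int_{[c_{u\kappa\tau},\,x]} \sum_{v \subseteq u} (-1)^{|v|}\,
\partial^u f_u(\mathcal{R}_{\rho_v}(y))\,dy\,\psi_{u\kappa\tau\gamma}(x)\,dx.
\]

For $y \in [0,1]^d$, let $k = k(y) \in [0,1]^d$ be the center point through which
the reflection $\mathcal{R}_\rho$ with $\rho = (r_1,\dots,r_d)$ operates on $y$.
That is, $k^j = b^{-r_j}(\lfloor b^{r_j} y^j \rfloor + 1/2)$. Because $\kappa$ is
finer than $\rho$, the same center $k$ applies for all
$y \in [c_{u\kappa\tau}, x]$. Then the $j$th component of
$\mathcal{R}_{\rho_v}(y)$ is $2k^j - y^j$ if $j \in v$ and is $y^j$ otherwise.
Therefore,

\[
\biggl| \sum_{v \subseteq u} (-1)^{|v|}\, \partial^u f_u\bigl((2k - y)^v : y^{-v}\bigr) \biggr|
= \mathrm{Vol}(\mathrm{rect}[y, 2k - y])\, |\partial^{u,u} f_u(z)|,
\]

where $z = z(y) \in \mathrm{rect}[y, 2k - y]$. The volume of
$\mathrm{rect}[y, 2k - y]$ is at most $b^{-|\rho|}$ and so following the argument
from Lemma 2,

\[
|\langle \tilde f,\psi_{u\kappa\tau\gamma}\rangle| \le
(1 - 1/b)^{d}\, b^{-|\rho|}\, b^{-(3|\kappa|+|u|)/2}\, \|\partial^{u,u} f_u\|_\infty. \qquad □
\]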
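
The extra factor $b^{-|\rho|}$ can be seen in a one-dimensional example. The sketch below takes $f(x) = x^2$ and folds at scale $r \le k$, so that on the support of $\psi_{ktc}$ the folded function is $\tilde f(x) = (x - k_c)^2 + k_c^2$ with $k_c$ the reflection center. It assumes (24) reads $|\langle\tilde f,\psi_{ktc}\rangle| \le b^{-r}\frac{b-1}{b}\, b^{-(3k+1)/2}\|f''\|_\infty$ in $d = 1$; the function names and test function are assumptions made for the example.

```python
def folded_coeff_x2(b, r, k, t, c):
    """<f~, psi_{ktc}> for f(x) = x**2 in d = 1, folded at scale r <= k."""
    # Reflection center shared by the whole support of psi_{ktc} (integer arithmetic).
    center = (t // b ** (k - r) + 0.5) / b ** r
    def integral(lo, hi):
        # Exact integral of f~(x) = (x - center)**2 + center**2 over [lo, hi].
        return ((hi - center) ** 3 - (lo - center) ** 3) / 3.0 + center ** 2 * (hi - lo)
    n_lo = (b * t + c) / b ** (k + 1)
    n_hi = (b * t + c + 1) / b ** (k + 1)
    w_lo, w_hi = t / b ** k, (t + 1) / b ** k
    return (b ** ((k + 1) / 2) * integral(n_lo, n_hi)
            - b ** ((k - 1) / 2) * integral(w_lo, w_hi))

def lemma5_bound_x2(b, r, k):
    """Assumed right-hand side of (24) in d = 1, with sup |f''| = 2."""
    return b ** -r * (b - 1) / b * b ** (-(3 * k + 1) / 2) * 2.0

worst_folded = max(abs(folded_coeff_x2(b, r, k, t, c)) / lemma5_bound_x2(b, r, k)
                   for b in (2, 3) for r in range(3) for k in range(r, 5)
                   for t in range(b ** k) for c in range(b))
```

In this example every ratio is at most one, and increasing $r$ with $k$ fixed shrinks the bound by a factor of $b$ per unit of $r$, which is the gain attributed to the reflection.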

The factor $b^{-|\rho|}$ in (24) underlies the improvement that a box
reflection can bring. For a scrambled $(\lambda,q,m,d)$-net in base $b$, if we
choose $\rho$ so that $|\rho| = m - q$, then the coefficients
$\langle \tilde f,\psi_{u\kappa\tau\gamma}\rangle$ with $\kappa$ finer than $\rho$ are
$O(b^{-3|\kappa|/2-|\rho|})$ instead of $O(b^{-3|\kappa|/2})$. Coarse terms with
$|\kappa| + |u| \le m - q$ do not contribute to the error, so the dominant error
terms have $|\kappa| + |u| = m - q + 1$. In the next theorem we will deal with
those terms by taking $|\rho| = m - q$. Choosing $|\rho| = m - q$, the largest
contributing coefficients are
$O(b^{-3|\kappa|/2-|\rho|}) = O(b^{-3m/2-m}) = O(n^{-5/2})$ instead of
$O(b^{-3|\kappa|/2}) = O(b^{-3m/2}) = O(n^{-3/2})$. Following the derivation in
Section 5.1, the terms $\sigma^2_{u\kappa}$ are then of order
$O(b^{-4m}) = O(n^{-4})$ instead of $O(b^{-2m}) = O(n^{-2})$ and so each of them
contributes $O(n^{-5})$ to the variance instead of $O(n^{-3})$. The variance
under box folding does not generally end up as $O(n^{-5+\epsilon})$ though,
because there are also contributions from terms $\kappa$ where $\kappa$ is not
finer than $\rho$.
Theorem 4. Let $x_1$ through $x_n$ be points of a randomized relaxed
$(\lambda,q,m,d)$-net in base $b$. Suppose that the quality parameter $q$
remains fixed as $n$ tends to infinity through values $\lambda b^m$ for fixed
$\lambda$ and that none of the gain coefficients of the net is larger than
$G < \infty$. Then for doubly smooth $f$, under box folding by
$\rho = (r_1,\dots,r_d)$ where

\[
r_j = \begin{cases}
\lfloor (m-q)/d \rfloor + 1, & j \le (m-q) - d\lfloor (m-q)/d \rfloor, \\
\lfloor (m-q)/d \rfloor, & \text{otherwise,}
\end{cases}
\]

we find that

\[
V(\hat I) = O\biggl( \frac{(\log n)^{d-1}}{n^{3+2/d}} \biggr)
\]

as $n \to \infty$.

Proof. First we consider coefficients
$\langle \tilde f,\psi_{u\kappa\tau\gamma}\rangle$ for the highest order subset
$u = \{1,\dots,d\}$. Let $w = w(\kappa) = \{j \in u \mid k_j \ge r_j\}$. If
$w = \emptyset$, then
$\sum_{j\in u} k_j \le \sum_{j\in u} (r_j - 1) = m - q - d$. Then
$|\kappa| + |u| \le m - q$, so that $\Gamma_{u\kappa} = 0$ by the balance property
of the digital net. Therefore, we restrict attention to $w$ with $|w| > 0$.
Lemma 5 treated the case with $w = u$ and with $\kappa$ finer than $\rho$.

For $x$ in the support of $\psi_{u\kappa\tau\gamma}$, the function $\tilde f$ is
differentiable with respect to $x^j$ for $j \in w$. We may apply equation (7)
to each $f^{\rho_v}$, keeping only the $\partial^w$ term because the others are
orthogonal to $\psi_{u\kappa\tau\gamma}$. The result shows that

\[
\begin{aligned}
2^d \int \tilde f_u(x)\,\psi_{u\kappa\tau\gamma}(x)\,dx
&= \int \sum_{v \subseteq u} \int_{[c^w_{u\kappa\tau},\,x^w]}
\partial^w f_u^{\rho_v}(x^{-w} : y^w)\,dy^w\,\psi_{u\kappa\tau\gamma}(x)\,dx \\
&= \int \int_{[c^w_{u\kappa\tau},\,x^w]} \sum_{v_1 \subseteq -w} \sum_{v_2 \subseteq w}
\partial^w f_u^{\rho_{v_1 \cup v_2}}(x^{-w} : y^w)\,dy^w\,\psi_{u\kappa\tau\gamma}(x)\,dx,
\end{aligned}
\tag{25}
\]

after decomposing $v$ into its intersections $v_1$ and $v_2$ with $-w$ and $w$
respectively.

The summation over $v_2$ inside of (25) satisfies

\[
\biggl| \sum_{v_2 \subseteq w} \partial^w f_u^{\rho_{v_1 \cup v_2}}(x^{-w} : y^w) \biggr|
= \mathrm{Vol}(\mathrm{rect}[y^w, 2k^w - y^w])\,
\bigl| \partial^{w,w} f_u\bigl(\mathcal{R}_{\rho_{v_1}}(x)^{-w} : z^w\bigr) \bigr|,
\]

where for $j \in w$, $k^j = b^{-r_j}(\lfloor b^{r_j} y^j \rfloor + 1/2)$ and
$z^w \in \mathrm{rect}[y^w, 2k^w - y^w]$. Because
$\mathrm{Vol}(\mathrm{rect}[y^w, 2k^w - y^w]) \le b^{-\sum_{j\in w} r_j}$, we find
that box reflection results in a coefficient
$\langle \tilde f,\psi_{u\kappa\tau\gamma}\rangle$ with an upper bound on the order of
$b^{-\sum_{j\in w} r_j}$ smaller than the bound for
$\langle f,\psi_{u\kappa\tau\gamma}\rangle$.

This coefficient reduction is
$b^{-\sum_{j\in w} r_j} = O(b^{-m|w|/d}) = O(n^{-|w|/d})$. Because we only need to
consider nonempty $w$, the reduction is $O(n^{-1/d})$. The effect is to reduce
the bound for $\sigma^2_{u\kappa}$ by $O(n^{-2/d})$ and then the same counting
argument as in Theorem 3 shows that the contribution of $\tilde f_u$ to the
variance is $O((\log n)^{d-1}/n^{3+2/d})$.

Now consider the variance contribution of $\tilde f_v$ for
$v \subset \{1,\dots,d\}$ with $1 \le |v| < d$. The sum
$(1/n)\sum_{i=1}^n \tilde f_v(x_i)$ is a box fold of a scrambled relaxed
$(\lambda b^{d-|v|}, q, m, |v|)$-net in base $b$ for estimating the mean of the
fully $|v|$-dimensional function $g(x^v) = f_v(x^v : 0^{-v})$ obtained by
ignoring the $-v$ components of $x$. Accordingly, it makes a variance
contribution that is $O((\log n)^{|v|-1}/n^{3+2/|v|})$. The variance of the sum
cannot be of higher order than $O((\log n)^{d-1}/n^{3+2/d})$. □
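
The allocation of reflection scales in Theorem 4 spreads $m - q$ as evenly as possible over the $d$ coordinates. The sketch below computes that allocation, assuming the inequality in the theorem is non-strict (so that the first few coordinates receive the extra unit and the scales sum to exactly $m - q$, as used in the discussion before the theorem). The function name `box_fold_scales` is hypothetical.

```python
def box_fold_scales(m, q, d):
    """Reflection scales r_1..r_d with sum m - q, as evenly spread as possible."""
    base, extra = divmod(m - q, d)
    # Coordinates j = 1..extra get one extra unit of resolution.
    return [base + 1 if j <= extra else base for j in range(1, d + 1)]

scales = box_fold_scales(m=10, q=1, d=4)
```

With every $r_j$ close to $(m-q)/d$, any nonempty set $w$ of coordinates gives a reduction factor of about $b^{-(m-q)/d}$, which is the source of the $2/d$ in the exponent of the variance rate.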

7. Discussion. In this paper we have seen that scrambled net quadrature 
can be profitably combined with antithetic sampling to reduce variance. 
This result then fits in with the work of [12] who combined quasi-Monte 
Carlo with control variates and [27] and [3] who both looked at quasi-Monte 
Carlo in combination with importance sampling. The best numerical results 
were for ASM scrambling combined with box reflections, but we have no 
theoretical results for that combination. 

The foldings of scrambled nets studied here may also be viewed as a hybrid 
of digital nets and a monomial cubature rule. The $2^d$-fold symmetry used by 
box folding takes each sample point in the net and uses it to generate the 
points of a cubature. It is one of many cubature rules that might be made 
to work with digital nets. For background and catalogues of cubature rules, 
see [4, 5] and [28]. 

The conclusions of Theorems 3 and 4 both hold if $\lambda$ and $q$ are allowed to 
fluctuate as n increases, so long as both remain below finite upper bounds. 

A larger improvement from local antithetic sampling may be possible if 
we can identify s < d input variables that are much more important than 
the others, and apply reflections only to them. In some cases we can even 
re-engineer the integrand to make a small number of variables much more 
important than they are in the nominal encoding. For an example of such 
a technique with an integrand with respect to a high dimensional geometric 
Brownian motion, see [1] and [2]. Many more examples are presented in [9]. 